A Noisy 10GB Provenance Database

Authors

  • You-Wei Cheah
  • Beth Plale
  • Joseph Kendall-Morwick
  • David B. Leake
  • Lavanya Ramakrishnan
Abstract

Provenance of scientific data is a key piece of the metadata record for the data's ongoing discovery and reuse. Provenance collection systems capture provenance on the fly; however, the protocol between the application and the provenance tool may not be reliable. Consequently, the provenance record can be partial, partitioned, and simply inaccurate. We used a workflow emulator that models faults to construct a large 10GB database of provenance that we know is noisy (that is, has errors). We discuss the process of generating the provenance database and show early results on the kinds of provenance analysis enabled by such a large provenance database.

Similar resources

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

A Review of Obsidian Studies in Iran: Provenance of Sources and of Obsidian from Archaeological Sites, Existing Research and Open Questions

Obsidian artifacts are frequently used materials in prehistory and are found widely in archaeological sites. Provenance studies of obsidian have been an issue of intense research and debate between archaeologists and geologists. Various provenance studies have been carried out from the 1960s up to 2015 in Anatolia and the Caucasus, but obsidian studies in Iran are at a very early stage and are considered as ter...

Grouping Provenance Information to Improve Efficiency of Access Control

Provenance is defined in some literature as a complete documentation of the process that led to an object. Provenance has been utilized in some contexts, e.g. database systems, file systems and grid systems. Provenance can be represented by a directed acyclic graph (DAG). In this paper we show an access control method for provenance information that is represented by a directed acyclic graph and...

Deciding How to Store Provenance

Provenance of a file is metadata pertaining to the history of the file. Provenance, unlike normal metadata stored in file systems, is retrieved primarily by running queries. This implies that provenance has to be indexed and should have a query interface. We believe that databases are the most appropriate place to store provenance as they provide both indexing and query capabilities. The goal o...

Provenance and Probabilities in Relational Databases: From Theory to Practice

We review the basics of data provenance in relational databases. We describe different provenance formalisms, from Boolean provenance to provenance semirings and beyond, that can be used for a wide variety of purposes, to obtain additional information on the output of a query. We discuss representation systems for data provenance, circuits in particular, with a focus on practical implementation...

Publication date: 2011